Restoring high-resolution text images to improve legibility and OCR accuracy
Internal identifier: 001376 (Main/Exploration); previous: 001375; next: 001377
Authors: Hirobumi Nishida [Japan]
Source:
- SPIE proceedings series [1017-2653]; 2005.
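French descriptors
- Pascal (Inist): Haute résolution, Résolution image, Reconnaissance optique caractère, Précision, Image binaire, Basse résolution, Restauration image, Formation image, Analyse image, Linéarité, Qualité image, Traitement image
English descriptors
- KwdEn: Accuracy, Binary image, High resolution, Image analysis, Image processing, Image quality, Image resolution, Image restoration, Imaging, Linearity, Low resolution, Optical character recognition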
Abstract
A new method for restoring high-resolution binary images is presented to improve legibility and OCR accuracy for low-resolution text images. The initially restored image is generated by simple techniques, and is then improved by integrating a variety of features obtained through image analysis. Missing strokes of characters are complemented based on topographic features. Contours of characters are then modified in terms of gradient magnitudes and curvatures along the contours. Finally, contours are beautified so that they look good to the human eye. The proposed method can deal with characters having complex structures such as Kanji, and entails relatively simple computation. Through experiments, it has been validated that the proposed method improves both OCR accuracy and legibility. In particular, smoothness and linearity along contours are significantly improved and strokes are restored correctly.
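The abstract says only that the initial restored image is "generated by simple techniques" and does not name them. As a rough illustration of what such a first step might look like, the sketch below upsamples a low-resolution grayscale image by bilinear interpolation and binarizes it with a fixed threshold. Both choices are assumptions made for illustration, not the author's method, and the names `upsample_bilinear`, `binarize`, and `initial_restoration` are hypothetical. The later stages described in the abstract (stroke completion from topographic features, contour adjustment by gradient magnitude and curvature, contour beautification) are not sketched here.

```python
def upsample_bilinear(gray, factor):
    """Upsample a 2-D grayscale image (list of rows, values 0..255) by an integer factor."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h * factor):
        # Map the output pixel centre back into source coordinates, clamped to the image.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two neighbouring rows, then vertically.
            top = gray[y0][x0] * (1 - fx) + gray[y0][x1] * fx
            bottom = gray[y1][x0] * (1 - fx) + gray[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Return 1 for dark (ink) pixels and 0 for background (assumed fixed threshold)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def initial_restoration(low_res, factor=2, threshold=128):
    """Hypothetical first-pass restoration: interpolate, then threshold."""
    return binarize(upsample_bilinear(low_res, factor), threshold)
```

For example, `initial_restoration([[255, 0], [255, 0]], factor=2)` turns a 2x2 image with a white left column and a black right column into a 4x4 binary image whose right half is ink. A result of this kind would then be the input to the refinement stages that the abstract describes.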
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000456
- to stream PascalFrancis, to step Curation: 000332
- to stream PascalFrancis, to step Checkpoint: 000384
- to stream Main, to step Merge: 001414
- to stream Main, to step Curation: 001376
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Restoring high-resolution text images to improve legibility and OCR accuracy</title>
<author><name sortKey="Nishida, Hirobumi" sort="Nishida, Hirobumi" uniqKey="Nishida H" first="Hirobumi" last="Nishida">Hirobumi Nishida</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Appliance Lab, Ricoh Co., Ltd., 1-1-17 Koishikawa</s1>
<s2>Bunkyo-ku, Tokyo 112-0002</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">05-0361314</idno>
<date when="2005">2005</date>
<idno type="stanalyst">PASCAL 05-0361314 INIST</idno>
<idno type="RBID">Pascal:05-0361314</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000456</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000332</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000384</idno>
<idno type="wicri:doubleKey">1017-2653:2005:Nishida H:restoring:high:resolution</idno>
<idno type="wicri:Area/Main/Merge">001414</idno>
<idno type="wicri:Area/Main/Curation">001376</idno>
<idno type="wicri:Area/Main/Exploration">001376</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Restoring high-resolution text images to improve legibility and OCR accuracy</title>
<author><name sortKey="Nishida, Hirobumi" sort="Nishida, Hirobumi" uniqKey="Nishida H" first="Hirobumi" last="Nishida">Hirobumi Nishida</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Image Appliance Lab, Ricoh Co., Ltd., 1-1-17 Koishikawa</s1>
<s2>Bunkyo-ku, Tokyo 112-0002</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<placeName><settlement type="city">Tokyo</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="2005">2005</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Binary image</term>
<term>High resolution</term>
<term>Image analysis</term>
<term>Image processing</term>
<term>Image quality</term>
<term>Image resolution</term>
<term>Image restoration</term>
<term>Imaging</term>
<term>Linearity</term>
<term>Low resolution</term>
<term>Optical character recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Haute résolution</term>
<term>Résolution image</term>
<term>Reconnaissance optique caractère</term>
<term>Précision</term>
<term>Image binaire</term>
<term>Basse résolution</term>
<term>Restauration image</term>
<term>Formation image</term>
<term>Analyse image</term>
<term>Linéarité</term>
<term>Qualité image</term>
<term>Traitement image</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">A new method for restoring high-resolution binary images is presented to improve legibility and OCR accuracy for low-resolution text images. The initially restored image is generated by simple techniques, and is then improved by integrating a variety of features obtained through image analysis. Missing strokes of characters are complemented based on topographic features. Contours of characters are then modified in terms of gradient magnitudes and curvatures along the contours. Finally, contours are beautified so that they look good to the human eye. The proposed method can deal with characters having complex structures such as Kanji, and entails relatively simple computation. Through experiments, it has been validated that the proposed method improves both OCR accuracy and legibility. In particular, smoothness and linearity along contours are significantly improved and strokes are restored correctly.</div>
</front>
</TEI>
<affiliations><list><country><li>Japon</li>
</country>
<settlement><li>Tokyo</li>
</settlement>
</list>
<tree><country name="Japon"><noRegion><name sortKey="Nishida, Hirobumi" sort="Nishida, Hirobumi" uniqKey="Nishida H" first="Hirobumi" last="Nishida">Hirobumi Nishida</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001376 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001376 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:05-0361314 |texte= Restoring high-resolution text images to improve legibility and OCR accuracy }}
This area was generated with Dilib version V0.6.32.